Introduction
Puerto Rico, a region highly susceptible to natural disasters, particularly hurricanes, faces significant challenges due to its tropical climate and island geography. The impacts of these disasters on the local population, infrastructure, and ecosystems are profound and often devastating. Hurricane Maria, for example, caused catastrophic damage and a severe humanitarian crisis: most of the island's population suffered flooding and a lack of basic resources. In response, our team is using Machine Learning models to identify areas at high risk of flooding. This initiative aims to provide early warnings to residents, enabling them to take essential precautions. By accurately predicting flood-prone zones, we can guide community members to safety and strategically mitigate the risk of casualties, thereby enhancing disaster preparedness and response efforts across the island.
Pre- and Post-Event NDVI Analysis
##Pre-modeling libraries
#Suppressing warnings
import warnings
#Ignoring all warnings
warnings.filterwarnings('ignore')
#Importing GIS tools
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import rasterio.features
import rioxarray as rio
from matplotlib.cm import RdYlGn, Reds
#Importing Planetary Computer tools
import pystac_client
import planetary_computer as pc
import odc
from odc.stac import stac_load
#Importing datetime functions
from datetime import date # date-related calculations
##Modeling libraries
#GeoTiff images
import rasterio
from osgeo import gdal
#Importing data visualization libraries
from matplotlib.pyplot import figure
import matplotlib.image as img
from PIL import Image
#Importing libraries for model building
import ultralytics
from ultralytics import YOLO
import labelme2yolo
#Other libraries
import os
import shutil
import zipfile
## Hurricane Maria - San Juan, Puerto Rico ##
#Defining the bounding box for the entire data region
min_lon = -66.19385887
min_lat = 18.27306794
max_lon = -66.08007533
max_lat = 18.48024350
#Setting geographic boundary
bounds = (min_lon, min_lat, max_lon, max_lat)
#Setting time window
time_window = "2017-03-31/2018-01-01"
#Connecting to the planetary computer
stac = pystac_client.Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")
#Searching for data
search = stac.search(collections = ["sentinel-2-l2a"],
bbox = bounds,
datetime = time_window)
#Instantiating results list
items = list(search.get_all_items())
#Pixel resolution for the final product
resolution = 10 # meters per pixel
#Scaling to degrees per pixel
scale = resolution / 111320.0 # degrees per pixel for CRS:4326
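The constant 111,320 m per degree holds for latitude, but one degree of longitude covers less ground away from the equator (it shrinks by cos(latitude)). A quick sanity check of the conversion above (a sketch; the function names are ours, only the 111,320 constant and the 10 m resolution come from the cell above):

```python
import math

METERS_PER_DEGREE = 111320.0  # approximate length of one degree of latitude

def degrees_per_pixel(resolution_m, meters_per_degree=METERS_PER_DEGREE):
    """Convert a pixel size in meters to degrees for EPSG:4326."""
    return resolution_m / meters_per_degree

def eastwest_meters_per_degree(lat_deg, meters_per_degree=METERS_PER_DEGREE):
    """Ground length of one degree of longitude at a given latitude."""
    return meters_per_degree * math.cos(math.radians(lat_deg))

scale = degrees_per_pixel(10)          # ~8.98e-05 degrees per pixel
ew = eastwest_meters_per_degree(18.4)  # ~105,600 m per degree near San Juan
```

So at San Juan's latitude a "10 m" pixel in EPSG:4326 is slightly wider than 10 m on the ground in the east-west direction; for this analysis the approximation is acceptable.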
#Loading the data into an xarray Dataset (xx)
xx = stac_load(
items,
bands = ["red", "green", "blue", "nir", "SCL"],
crs = "EPSG:4326", # latitude-longitude
resolution = scale, # degrees
chunks = {"x": 2048, "y": 2048},
dtype = "uint16",
patch_url = pc.sign,
bbox = bounds
)
#Instantiating a colormap for SCL pixel classifications
scl_colormap = np.array(
[
[252, 40, 228, 255], # 0 - NODATA - MAGENTA
[255, 0, 4, 255], # 1 - Saturated or Defective - RED
[0 , 0, 0, 255], # 2 - Dark Areas - BLACK
[97 , 97, 97, 255], # 3 - Cloud Shadow - DARK GREY
[3 , 139, 80, 255], # 4 - Vegetation - GREEN
[192, 132, 12, 255], # 5 - Bare Ground - BROWN
[21 , 103, 141, 255], # 6 - Water - BLUE
[117, 0, 27, 255], # 7 - Unclassified - MAROON
[208, 208, 208, 255], # 8 - Cloud - LIGHT GREY
[244, 244, 244, 255], # 9 - Definitely Cloud - WHITE
[195, 231, 240, 255], # 10 - Thin Cloud - LIGHT BLUE
[222, 157, 204, 255], # 11 - Snow or Ice - PINK
],
dtype="uint8",
)
#Filtering out water, etc.
filter_values = [0, 1, 3, 6, 8, 9, 10]
#Defining cloud mask for filter values
cloud_mask = ~xx.SCL.isin(filter_values)
#Applying cloud mask to filter out water, clouds and cloud shadows
# storing as 16-bit integers
cleaned_data = xx.where(cloud_mask).astype("uint16")
#Preparing pairs of time steps to compare NDVI outputs
first_time = 0
second_time = 38
first_time_2 = 11
second_time_2 = 30
first_time_3 = 18
second_time_3 = 39
#Plots of NDVI at two different time slices (1)
#Setting figure size
fig, ax = plt.subplots(1, 2, figsize = (15, 10))
#First image data
ndvi_image = (cleaned_data.nir - cleaned_data.red) / (cleaned_data.nir + cleaned_data.red)
ndvi_image.isel(time = first_time ).plot(ax = ax[0],
vmin = 0.0,
vmax = 0.8,
cmap = "RdYlGn")
#Second image data
ndvi_image.isel(time = second_time).plot(ax = ax[1],
vmin = 0.0,
vmax = 0.8,
cmap = "RdYlGn")
#Axis labels
ax[0].set_title(label = 'NDVI-Time #1'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title(label = 'NDVI-Time #2'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)
#Showing the plot
plt.show()
#Plots of NDVI at two different time slices (2)
#Setting figure size
fig, ax = plt.subplots(1, 2, figsize = (15, 10))
#Third image data
ndvi_image = (cleaned_data.nir - cleaned_data.red) / (cleaned_data.nir + cleaned_data.red)
ndvi_image.isel(time = first_time_2 ).plot(ax = ax[0],
vmin = 0.0,
vmax = 0.8,
cmap = "RdYlGn")
#Fourth image data
ndvi_image.isel(time = second_time_2).plot(ax = ax[1],
vmin = 0.0,
vmax = 0.8,
cmap = "RdYlGn")
#Axis labels
ax[0].set_title(label = 'NDVI-Time #3'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title(label = 'NDVI-Time #4'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)
#Showing the plot
plt.show()
#Plots of NDVI at two different time slices (3)
#Setting figure size
fig, ax = plt.subplots(1, 2, figsize = (15, 10))
#Fifth image data
ndvi_image = (cleaned_data.nir - cleaned_data.red) / (cleaned_data.nir + cleaned_data.red)
ndvi_image.isel(time = first_time_3 ).plot(ax = ax[0],
vmin = 0.0,
vmax = 0.8,
cmap = "RdYlGn")
#Sixth image data
ndvi_image.isel(time = second_time_3).plot(ax = ax[1],
vmin = 0.0,
vmax = 0.8,
cmap = "RdYlGn")
#Axis labels
ax[0].set_title(label = 'NDVI-Time #5'), ax[0].xaxis.set_visible(False), ax[0].yaxis.set_visible(False)
ax[1].set_title(label = 'NDVI-Time #6'), ax[1].xaxis.set_visible(False), ax[1].yaxis.set_visible(False)
#Showing the plot
plt.show()
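The three plotting cells above differ only in the time indices they select. A small helper could collapse them into one loop (a sketch; the `plot_ndvi_pair` name is ours, and it takes plain 2-D arrays rather than the `ndvi_image` DataArray for illustration):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs in scripts
import matplotlib.pyplot as plt

def plot_ndvi_pair(ndvi_a, ndvi_b, titles=("NDVI-Time #1", "NDVI-Time #2")):
    """Plot two NDVI rasters side by side with a shared color scale."""
    fig, ax = plt.subplots(1, 2, figsize=(15, 10))
    for axis, data, title in zip(ax, (ndvi_a, ndvi_b), titles):
        axis.imshow(data, vmin=0.0, vmax=0.8, cmap="RdYlGn")
        axis.set_title(title)
        axis.xaxis.set_visible(False)
        axis.yaxis.set_visible(False)
    return fig

# For the xarray data above, something like:
# for pre, post in [(0, 38), (11, 30), (18, 39)]:
#     plot_ndvi_pair(ndvi_image.isel(time=pre).values,
#                    ndvi_image.isel(time=post).values)
#     plt.show()
```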
#Function for calculating NDVI anomalies
def NDVI(dataset):
return (dataset.nir - dataset.red) / (dataset.nir + dataset.red)
#Running comparison
ndvi_clean = NDVI(cleaned_data)
#Calculating difference (1)
ndvi_pre = ndvi_clean.isel(time = first_time)
ndvi_post = ndvi_clean.isel(time = second_time)
ndvi_anomaly = ndvi_post - ndvi_pre
#All areas of water or clouds will be black
RdYlGn.set_bad('black',1.)
#Reversing the colormap for reds
Reds_reverse = "Reds_r"
##Plotting NDVI anomaly (1)
plt.figure( figsize = (6,10) )
ndvi_anomaly.plot(vmin = -0.2, vmax=0.0, cmap = Reds_reverse, add_colorbar=False)
#Titles and labels
plt.title (label = "NDVI Anomaly")
plt.xlabel(xlabel = "Longitude")
plt.ylabel(ylabel = "Latitude")
#Showing the plot
plt.show()
#Calculating difference (2)
ndvi_pre = ndvi_clean.isel(time = first_time_2)
ndvi_post = ndvi_clean.isel(time = second_time_2)
ndvi_anomaly = ndvi_post - ndvi_pre
#All areas of water or clouds will be black
RdYlGn.set_bad('black',1.)
#Reversing the colormap for reds
Reds_reverse = "Reds_r"
##Plotting NDVI anomaly (2)
plt.figure( figsize = (6,10) )
ndvi_anomaly.plot(vmin = -0.2, vmax=0.0, cmap = Reds_reverse, add_colorbar=False)
#Titles and labels
plt.title (label = "NDVI Anomaly")
plt.xlabel(xlabel = "Longitude")
plt.ylabel(ylabel = "Latitude")
#Showing the plot
plt.show()
#Calculating difference (3)
ndvi_pre = ndvi_clean.isel(time = first_time_3)
ndvi_post = ndvi_clean.isel(time = second_time_3)
ndvi_anomaly = ndvi_post - ndvi_pre
#All areas of water or clouds will be black
RdYlGn.set_bad('black',1.)
#Reversing the colormap for reds
Reds_reverse = "Reds_r"
##Plotting NDVI anomaly (3)
plt.figure( figsize = (6,10) )
ndvi_anomaly.plot(vmin = -0.2, vmax=0.0, cmap = Reds_reverse, add_colorbar=False)
#Titles and labels
plt.title (label = "NDVI Anomaly")
plt.xlabel(xlabel = "Longitude")
plt.ylabel(ylabel = "Latitude")
#Showing the plot
plt.show()
Actionable Recommendations
The first actionable recommendation is to increase funding for "Asistencia para la Mitigación de Riesgos" (Hazard Mitigation Assistance, FEMA), a public program dedicated to mitigating losses during natural disasters. The goal would be to incentivize the allocation of monetary resources to effectively carry out evacuation plans and ensure the availability of motivated personnel. By offering incentives within the program, individuals would be encouraged to prioritize evacuation preparedness. Such a program is intended to foster a culture of readiness and proactive response to potential emergencies or disasters, and its objective is to enhance state presence and strengthen community cooperation. (Gobierno de Puerto Rico)
The second actionable recommendation, directly tied to the first, is to establish an evacuation plan and gather emergency supplies. The purpose is to guide residents who live near the coastline or in areas at risk of river overflows to relocate to areas on firmer ground that are not directly affected by a sudden rise in sea level. These areas are found along the city outskirts and appear less affected by the hurricane in general, but essential supplies such as drinkable water and food are not readily available there. Determining accessible routes is an important aspect of this plan: in many cases, people reside in areas with immediate exposure to water, which makes this effort a priority. Introducing more preparedness strategies to the community is expected to help reduce mental and physical harm after the event, as shown by Joshipura et al. (2022).
Thirdly, as a complement to the second recommendation, it's crucial to spread the word about hurricane dangers. Whether through radio ads, social media, or community outreach, getting the message out is essential. Many people don't realize just how serious hurricanes can be, so educating them about the risks is key; that way, when evacuation plans spring into action, people will understand why it's so important to follow them. It's all about making sure everyone's on the same page when it comes to staying safe during a storm. Social media has proved to be an invaluable tool for communication during and after a natural disaster, as studied by Bacon & Mujkic (2016), helping nonprofit organizations act immediately to provide the necessary assistance.
Object Detection (Image Labeling)
Image labeling was done using LabelMe: 25 pre-event and 25 post-event images were labeled and saved into the "manually labeled" folder. These images were not used to train the model; instead, we utilized a dataset from Roboflow (the reference for it can be found in the README text files inside the zip file). The images downloaded from Roboflow are most likely the result of another object detection model, given that some images are missing clear labels and others have backgrounds labeled as buildings. The dataset also used a different labeling approach from ours, drawing rectangles that incorporate some of the background, while we used the manual tool to trace the edges of the buildings without including the surroundings. Although the labeling in these images is not perfect, it significantly improved our mAP score.
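LabelMe stores each traced outline as a list of (x, y) pixel points in a per-image JSON file, while YOLO expects one normalized `class x_center y_center width height` line per object. The `labelme2yolo` package imported earlier automates this conversion; the core geometric step it performs can be sketched as follows (the helper name and the example footprint are ours, not the package's API):

```python
def polygon_to_yolo_bbox(points, img_w, img_h):
    """Convert a LabelMe polygon to a normalized YOLO bounding box.

    points: list of (x, y) pixel coordinates tracing the object's edges.
    Returns (x_center, y_center, width, height), each scaled to [0, 1].
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return (
        (x_min + x_max) / 2 / img_w,  # x_center
        (y_min + y_max) / 2 / img_h,  # y_center
        (x_max - x_min) / img_w,      # width
        (y_max - y_min) / img_h,      # height
    )

# A square building footprint in a 640x640 tile:
box = polygon_to_yolo_bbox([(100, 100), (300, 100), (300, 300), (100, 300)], 640, 640)
# → (0.3125, 0.3125, 0.3125, 0.3125)
```

Note that this bounding-box conversion discards the polygon's edge detail, which is why our carefully traced outlines end up no tighter than Roboflow's rectangles once converted for YOLO detection.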
Model Building
#Loading the model
model = YOLO('yolov8n.pt')
#Hyperparameter Tuning (This step was done using an External GPU so the outputs are not visible in this notebook)
#model.tune(data='/Users/juanmanumango/Desktop/Hult/Business Challenege III/Model/data.yaml', epochs=50, iterations=60, optimizer='SGD', val=False)
# Train the model on the dataset for 50 epochs (With best params from tuning)
results = model.train(data = './data.yaml',
seed = 702, #Seed to replicate results
imgsz = 640, #Image size
epochs = 50, #Number of epochs
optimizer = 'SGD',
lr0 = 0.00922, #Critical
lrf = 0.00707,
momentum = 0.98,
weight_decay = 0.00052,
warmup_epochs = 3.0281,
warmup_momentum = 0.95,
box = 8.78755, #Might improve mAP score
cls = 0.44802, #Same
dfl = 1.64183,
hsv_h = 0.00859, #For different lighting
hsv_s = 0.79619,
hsv_v = 0.38164,
degrees = 0.0, #For objectives in different angles
translate = 0.13664,
scale = 0.33876,
shear = 0.0,
perspective = 0.0,
flipud = 0.0,
fliplr = 0.40009,
bgr = 0.0,
mosaic = 0.87205, #Default 1
mixup = 0, #Thought this would be useful
copy_paste = 0
)
Ultralytics YOLOv8.2.0 🚀 Python-3.11.8 torch-2.2.2 CPU (Apple M2)
engine/trainer: task=detect, mode=train, model=yolov8n.pt, data=./data.yaml, epochs=50, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train63, exist_ok=False, pretrained=True, optimizer=SGD, verbose=True, seed=702, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.00922, lrf=0.00707, momentum=0.98, weight_decay=0.00052, warmup_epochs=3.0281, warmup_momentum=0.95, warmup_bias_lr=0.1, box=8.78755, cls=0.44802, dfl=1.64183, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.00859, hsv_s=0.79619, hsv_v=0.38164, degrees=0.0, translate=0.13664, scale=0.33876, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.40009, bgr=0.0, mosaic=0.87205, mixup=0, copy_paste=0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=/Users/juanmanumango/runs/detect/train63
Overriding model.yaml nc=80 with nc=4
from n params module arguments
0 -1 1 464 ultralytics.nn.modules.conv.Conv [3, 16, 3, 2]
1 -1 1 4672 ultralytics.nn.modules.conv.Conv [16, 32, 3, 2]
2 -1 1 7360 ultralytics.nn.modules.block.C2f [32, 32, 1, True]
3 -1 1 18560 ultralytics.nn.modules.conv.Conv [32, 64, 3, 2]
4 -1 2 49664 ultralytics.nn.modules.block.C2f [64, 64, 2, True]
5 -1 1 73984 ultralytics.nn.modules.conv.Conv [64, 128, 3, 2]
6 -1 2 197632 ultralytics.nn.modules.block.C2f [128, 128, 2, True]
7 -1 1 295424 ultralytics.nn.modules.conv.Conv [128, 256, 3, 2]
8 -1 1 460288 ultralytics.nn.modules.block.C2f [256, 256, 1, True]
9 -1 1 164608 ultralytics.nn.modules.block.SPPF [256, 256, 5]
10 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
11 [-1, 6] 1 0 ultralytics.nn.modules.conv.Concat [1]
12 -1 1 148224 ultralytics.nn.modules.block.C2f [384, 128, 1]
13 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
14 [-1, 4] 1 0 ultralytics.nn.modules.conv.Concat [1]
15 -1 1 37248 ultralytics.nn.modules.block.C2f [192, 64, 1]
16 -1 1 36992 ultralytics.nn.modules.conv.Conv [64, 64, 3, 2]
17 [-1, 12] 1 0 ultralytics.nn.modules.conv.Concat [1]
18 -1 1 123648 ultralytics.nn.modules.block.C2f [192, 128, 1]
19 -1 1 147712 ultralytics.nn.modules.conv.Conv [128, 128, 3, 2]
20 [-1, 9] 1 0 ultralytics.nn.modules.conv.Concat [1]
21 -1 1 493056 ultralytics.nn.modules.block.C2f [384, 256, 1]
22 [15, 18, 21] 1 752092 ultralytics.nn.modules.head.Detect [4, [64, 128, 256]]
Model summary: 225 layers, 3011628 parameters, 3011612 gradients, 8.2 GFLOPs
Transferred 319/355 items from pretrained weights
Freezing layer 'model.22.dfl.conv.weight'
train: Scanning /Users/juanmanumango/Desktop/A1_Team2/labels/train... 200 images
train: New cache created: /Users/juanmanumango/Desktop/A1_Team2/labels/train.cache
val: Scanning /Users/juanmanumango/Desktop/A1_Team2/labels/val... 55 images, 0 b
val: New cache created: /Users/juanmanumango/Desktop/A1_Team2/labels/val.cache
WARNING ⚠️ Box and segment counts should be equal, but got len(segments) = 192, len(boxes) = 914. To resolve this only boxes will be used and all segments will be removed. To avoid this please supply either a detect or segment dataset, not a detect-segment mixed dataset.
Plotting labels to /Users/juanmanumango/runs/detect/train63/labels.jpg...
optimizer: SGD(lr=0.00922, momentum=0.98) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.00052), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to /Users/juanmanumango/runs/detect/train63
Starting training for 50 epochs...
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/50 0G 2.042 3.722 1.912 697 640:
[KeyboardInterrupt traceback omitted: the local CPU run was stopped manually during epoch 1/50; the full 50-epoch run was completed on an external GPU.]
The model was also run using an external GPU; to see the outputs of that run, please refer to the SGD.ipynb file. Nevertheless, the model should still run without any issues using a small sample of the larger dataset used in our actual model.
Model Analysis
#Plotting model results
figure(figsize=(15, 10), dpi=80)
# reading the image
results = img.imread(fname = './train124/results.png', format = 'png') # change this as needed
# displaying the image
plt.imshow(results)
#plt.axis('off') # Turn off axis numbers and ticks
plt.show()
The results for the model were excellent; the best mAP score ended up being 0.789 (this can be seen in the SGD.ipynb file). This was mainly due to the quality of our data. The dataset downloaded from Roboflow provided a large amount of training data that was labeled almost perfectly, which was the biggest factor in improving our score. Hyperparameter tuning ended up not being as significant as we expected, improving our model by less than 10%. With more time, we could have executed a better tuning process that might have significantly improved the model.
#Plotting confusion matrix
figure(figsize=(20,15), dpi=80)
# reading the image
cf = img.imread('./train124/confusion_matrix.png') # change this as needed
# displaying the image
plt.imshow(cf)
plt.show()
Looking at the confusion matrix, we can observe the model's areas of improvement. Overall the predictions were good, but many background regions were labeled as buildings, and there are a few false predictions among the residential building classes. Adjusting class weights so the model is less sensitive to these false positives would improve our precision score and therefore improve our mAP score.
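The background-as-building detections are false positives, which drag precision (TP / (TP + FP)) down and, through it, mAP. A toy calculation with made-up counts (a sketch, not our model's actual numbers) shows why suppressing them helps even at a small cost in true positives:

```python
def precision(tp, fp):
    """Precision = true positives over all positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical counts for the 'building' class:
before = precision(tp=80, fp=40)  # background patches counted as buildings
after = precision(tp=78, fp=10)   # stricter model drops most false positives
# before ≈ 0.667, after ≈ 0.886 — a large precision gain for a small recall cost
```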
Making Predictions on the Submission Data
#Loading best model
model = YOLO('./train124/weights/best.pt') # change this as needed
# # Decoding according to the .yaml file class names order
# decoding_of_predictions ={0: 'undamagedcommercialbuilding', 1: 'undamagedresidentialbuilding', 2: 'damagedresidentialbuilding', 3: 'damagedcommercialbuilding'}
# directory = './Submission Data (12 images)'
# # Directory to store outputs
# results_directory = 'Validation_Data_Results'
# # Create submission directory if it doesn't exist
# if not os.path.exists(results_directory):
# os.makedirs(results_directory)
# # Loop through each file in the directory
# for filename in os.listdir(directory):
# # Check if the current object is a file and ends with .jpeg
# if os.path.isfile(os.path.join(directory, filename)) and filename.lower().endswith('.jpg'):
# # Perform operations on the file
# file_path = os.path.join(directory, filename)
# print(file_path)
# print("Making a prediction on ", filename)
# results = model.predict(file_path, save=True, iou=0.5, save_txt=True, conf=0.25)
# for r in results:
# conf_list = r.boxes.conf.numpy().tolist()
# clss_list = r.boxes.cls.numpy().tolist()
# original_list = clss_list
# updated_list = []
# for element in original_list:
# updated_list.append(decoding_of_predictions[int(element)])
# bounding_boxes = r.boxes.xyxy.numpy()
# confidences = conf_list
# class_names = updated_list
# # Check if bounding boxes, confidences and class names match
# if len(bounding_boxes) != len(confidences) or len(bounding_boxes) != len(class_names):
# print("Error: Number of bounding boxes, confidences, and class names should be the same.")
# continue
# text_file_name = os.path.splitext(filename)[0]
# # Creating a new .txt file for each image in the submission_directory
# with open(os.path.join(results_directory, f"{text_file_name}.txt"), "w") as file:
# for i in range(len(bounding_boxes)):
# # Get coordinates of each bounding box
# left, top, right, bottom = bounding_boxes[i]
# # Write content to file in desired format
# file.write(f"{class_names[i]} {confidences[i]} {left} {top} {right} {bottom}\n")
# print("Output files generated successfully.")
./Submission Data (12 images)/Validation_Post_Event_009.jpg Making a prediction on Validation_Post_Event_009.jpg image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_009.jpg: 640x640 6 damagedcommercialbuildings, 5 damagedresidentialbuildings, 12 undamagedcommercialbuildings, 9 undamagedresidentialbuildings, 88.9ms Speed: 4.5ms preprocess, 88.9ms inference, 1.3ms postprocess per image at shape (1, 3, 640, 640) Results saved to /Users/juanmanumango/runs/detect/predict3 12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels Output files generated successfully. ./Submission Data (12 images)/Validation_Post_Event_008.jpg Making a prediction on Validation_Post_Event_008.jpg image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_008.jpg: 640x640 2 damagedcommercialbuildings, 7 damagedresidentialbuildings, 2 undamagedcommercialbuildings, 14 undamagedresidentialbuildings, 76.0ms Speed: 1.6ms preprocess, 76.0ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 640) Results saved to /Users/juanmanumango/runs/detect/predict3 12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels Output files generated successfully. ./Submission Data (12 images)/Validation_Post_Event_011.jpg Making a prediction on Validation_Post_Event_011.jpg image 1/1 /Users/juanmanumango/Desktop/A1/Submission Data (12 images)/Validation_Post_Event_011.jpg: 640x640 3 damagedcommercialbuildings, 6 damagedresidentialbuildings, 7 undamagedcommercialbuildings, 6 undamagedresidentialbuildings, 77.7ms Speed: 1.7ms preprocess, 77.7ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 640) Results saved to /Users/juanmanumango/runs/detect/predict3 12 labels saved to /Users/juanmanumango/runs/detect/predict3/labels Output files generated successfully. 
YOLO prediction log (condensed from the raw console output). Every image was processed at a 640x640 input shape; results were saved to /Users/juanmanumango/runs/detect/predict3, with 12 label files written to /Users/juanmanumango/runs/detect/predict3/labels.
Validation_Post_Event_005.jpg: 10 damaged residential, 2 undamaged commercial, 37 undamaged residential (78.6 ms inference)
Validation_Post_Event_004.jpg: 5 damaged residential, 12 undamaged residential (82.3 ms inference)
Validation_Post_Event_010.jpg: 9 damaged commercial, 4 damaged residential, 9 undamaged commercial, 5 undamaged residential (82.7 ms inference)
Validation_Post_Event_006.jpg: 6 damaged commercial, 14 damaged residential, 4 undamaged commercial, 32 undamaged residential (119.2 ms inference)
Validation_Post_Event_012.jpg: 2 damaged commercial, 12 damaged residential, 7 undamaged commercial, 22 undamaged residential (79.2 ms inference)
Validation_Post_Event_007.jpg: 1 damaged commercial, 26 damaged residential, 2 undamaged commercial, 28 undamaged residential (98.2 ms inference)
Validation_Post_Event_003.jpg: 12 damaged residential, 12 undamaged residential (102.1 ms inference)
Validation_Post_Event_002.jpg: 1 damaged commercial, 11 damaged residential, 10 undamaged residential (91.4 ms inference)
Validation_Post_Event_001.jpg: 18 damaged residential, 1 undamaged commercial, 34 undamaged residential (88.5 ms inference)
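For a quick overview, the per-image detections in the prediction log can be aggregated into overall class totals with a few lines of Python. This is a sketch rather than part of the original pipeline; the numbers are transcribed by hand from the log output above.

```python
# Per-image detection counts transcribed from the prediction log above.
# Tuple order: (damaged commercial, damaged residential,
#               undamaged commercial, undamaged residential)
image_counts = {
    "Validation_Post_Event_005": (0, 10, 2, 37),
    "Validation_Post_Event_004": (0, 5, 0, 12),
    "Validation_Post_Event_010": (9, 4, 9, 5),
    "Validation_Post_Event_006": (6, 14, 4, 32),
    "Validation_Post_Event_012": (2, 12, 7, 22),
    "Validation_Post_Event_007": (1, 26, 2, 28),
    "Validation_Post_Event_003": (0, 12, 0, 12),
    "Validation_Post_Event_002": (1, 11, 0, 10),
    "Validation_Post_Event_001": (0, 18, 1, 34),
}

class_names = ("damaged commercial", "damaged residential",
               "undamaged commercial", "undamaged residential")
# Sum each class column across all images
totals = [sum(c[i] for c in image_counts.values()) for i in range(4)]
for name, total in zip(class_names, totals):
    print(f"{name}: {total}")
```

Across these nine validation images, the model reports 131 damaged buildings (19 commercial, 112 residential) and 217 undamaged ones.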
The submission results can be found in the zip file under the "Best Model Results" folder.
Conclusion
After finishing this project, the most notable insights we encountered were as follows:
According to the National Oceanic and Atmospheric Administration, Hurricane Maria intensified into a Category 5 hurricane within an 18-hour period. This rapid intensification highlights the need for swift response mechanisms in hurricane-prone regions.
Based on the findings from our analysis, the buildings in San Juan that suffered the most damage were commercial. Out of a total of 14,997 residential buildings and 2,411 commercial buildings, 14% of residential buildings and 19% of commercial buildings were damaged. This shows the vulnerability of the area's infrastructure and the need for robust building codes and construction practices.
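As a quick sanity check, the percentages above translate into absolute building counts as follows:

```python
# Totals reported in the text
residential_total = 14_997
commercial_total = 2_411

# Damage rates reported in the text
residential_damaged = round(0.14 * residential_total)  # ~2,100 buildings
commercial_damaged = round(0.19 * commercial_total)    # ~458 buildings

print(residential_damaged, commercial_damaged)  # → 2100 458
```

So although residential buildings account for far more damaged structures in absolute terms, commercial buildings were damaged at a higher rate.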
As reported by RAND, Puerto Rico's municipal governments were unprepared for a disaster of this magnitude. This lack of preparedness compounded the effects of the hurricane and hindered effective response and recovery efforts.
Based on these insights, our team proposes several actionable recommendations to enhance disaster preparedness and response in Puerto Rico. These include refining the existing evacuation plans to be more region-specific, which would allow for more targeted and efficient evacuations. Also, we suggest implementing area-specific disaster management plans that address the unique needs and vulnerabilities of different regions and their people. As a final recommendation, an educational program should be established to raise awareness about disaster preparedness and provide training on how to effectively respond to emergencies. Such initiatives are crucial for building resilience and ensuring a more robust response to future natural disasters. By implementing these strategies, we believe Puerto Rico can better safeguard its communities and infrastructure against the inevitable challenges posed by future storms.
Steps we would take to improve the model if we had more time
Enhance data collection:
We would expand our data collection efforts to include not only NDVI but also other relevant indices like soil moisture and land surface temperature. This would provide a more comprehensive dataset for analysis, allowing for better predictions and insights into the effects of hurricanes on different types of terrain and vegetation.
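To illustrate, a moisture-sensitive index can be derived from the same Sentinel-2 bands already loaded for the NDVI analysis. The sketch below uses synthetic reflectance arrays in place of a real scene loaded with stac_load; the band roles (B04 red, B08 NIR, B11 SWIR) and the NDMI formula are standard, while land surface temperature would require thermal data and is not shown.

```python
import numpy as np

# Synthetic band reflectances standing in for a real Sentinel-2 scene
rng = np.random.default_rng(0)
red  = rng.uniform(0.02, 0.30, size=(4, 4))   # B04
nir  = rng.uniform(0.20, 0.60, size=(4, 4))   # B08
swir = rng.uniform(0.05, 0.40, size=(4, 4))   # B11

# NDVI: vegetation vigour (same formula as in our pre/post-event analysis)
ndvi = (nir - red) / (nir + red)
# NDMI: a simple moisture proxy, computable from bands we already download
ndmi = (nir - swir) / (nir + swir)

print("NDVI range:", ndvi.min(), ndvi.max())
print("NDMI range:", ndmi.min(), ndmi.max())
```

Both indices are bounded in (-1, 1), so they can be stacked with NDVI into a multi-channel input without rescaling.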
Refine our predictive model:
The main challenge for us was getting used to the tools and methods required for this assignment. Learning how to use external GPUs and tools that were completely new to us, such as YOLO, took a lot of time to practice and understand. With more time, we believe we could have done a much better job tuning our model. Tools such as Ray Tune, which we did not manage to get working properly, could have substantially improved our model's performance.
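Even without Ray Tune, the underlying idea of hyperparameter search can be illustrated with a plain random search. The objective function below is a hypothetical stand-in for "train the model with these settings and return its validation mAP"; the parameter names and ranges are assumptions for illustration.

```python
import random

random.seed(42)

def validation_score(lr, epochs):
    """Hypothetical stand-in for training YOLO and returning validation mAP."""
    return 1.0 - abs(lr - 0.01) * 10 - abs(epochs - 60) / 200

best, best_score = None, float("-inf")
for _ in range(20):                                  # 20 random trials
    cfg = {"lr": 10 ** random.uniform(-4, -1),       # log-uniform learning rate
           "epochs": random.randint(20, 100)}
    score = validation_score(**cfg)
    if score > best_score:
        best, best_score = cfg, score

print("best config:", best, "score:", round(best_score, 3))
```

A real tuner like Ray Tune adds smarter sampling and early stopping of bad trials, but the train-evaluate-keep-best loop is the same.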
Adjusting class weights is another improvement we believe would significantly increase our score: making the model less sensitive to the over-represented classes would lead to more accurate predictions. With more time, we are confident we could have added this feature to our model.
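One common recipe for class weighting is inverse frequency, where rare classes receive proportionally larger weights in the loss. A sketch, using the class totals from our validation predictions as stand-in counts for the training distribution:

```python
# Stand-in class counts (totals from our validation predictions)
class_counts = {"damaged commercial": 19, "damaged residential": 112,
                "undamaged commercial": 25, "undamaged residential": 192}

n_total = sum(class_counts.values())
n_classes = len(class_counts)

# Inverse-frequency weights: w_c = N / (K * n_c), so rare classes weigh more
weights = {cls: n_total / (n_classes * n) for cls, n in class_counts.items()}
for cls, w in weights.items():
    print(f"{cls}: {w:.2f}")
```

With these counts, the rarest class ("damaged commercial") gets roughly ten times the weight of the most common one ("undamaged residential"), counteracting the imbalance.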
Apart from these more technical improvements, if we had three months to work on this project, the ideal scenario would have been to manually label more images and use those instead of the Roboflow dataset. Although the dataset we found seems good enough, we would have preferred to do this work ourselves to prevent any mistakes or errors the creators of the dataset might have made.
Build partnerships:
We would strengthen our ties with local governments, non-governmental organizations, and community leaders. These partnerships would be key in turning our data into actionable plans that can directly benefit the people most at risk.
Bibliography
Bacon Brengarth, L., & Mujkic, E. (2016). Web 2.0: How social media applications leverage nonprofit responses during a wildfire crisis. Computers in Human Behavior, 54, 589-596. https://doi.org/10.1016/j.chb.2015.07.010
Gobierno de Puerto Rico. (n.d.). Mitigación de riesgos [Hazard mitigation]. Recuperación de Puerto Rico. https://recovery.pr.gov/es/recovery-programs/hazard-mitigation-assistance
Joshipura, K. J., Martínez-Lozano, M., Ríos-Jiménez, P. I., Camacho-Monclova, D. M., Noboa-Ramos, C., Alvarado-González, G. A., & Lowe, S. R. (2022). Preparedness, hurricanes Irma and Maria, and impact on health in Puerto Rico. International Journal of Disaster Risk Reduction, 67, 102657.
RAND Corporation. (2018). Hurricanes Irma and Maria: Impact and aftermath. https://www.rand.org/hsrd/hsoac/projects/puerto-rico-recovery/hurricanes-irma-and-maria.html
US Department of Commerce, NOAA, National Weather Service. (2017, September 20). Major Hurricane Maria - September 20, 2017. https://www.weather.gov/sju/maria2017
Feedback for EY
The challenge at hand was a very interesting one that we would have loved to explore further without such a harsh time constraint. As a team, we found it very appealing to work with object detection, even though it was completely new to all of us. If we were to mention an improvement, it would be to optimize model training by adjusting the number of epochs and adopting transfer learning, which can significantly reduce runtime. In our case, we would also like to experiment with more object detection tools, perhaps focusing on different types of buildings depending on the area and using classes to decide on the level of damage and urgency of repair. This would narrow the scope to the structures that are most dangerous and in need of the most immediate assistance.
Besides this, we greatly enjoyed working with satellite imagery and being completely hands-on with what we were doing. It gave us a feeling of acting with a global purpose for the greater good. In summary, while we faced time constraints and the challenge of adopting new technologies, our team's commitment to innovation and social impact has only grown stronger. We look forward to refining our approach, furthering our technical capabilities, and continuing to contribute to meaningful projects with significant global relevance.